Answer Search Indonesian Language Hadith Using Vector Space Model in PDF Document

Author

  • Bagus Priambodo
Abstract

Digital text documents circulate in various formats; the most widely used today include Word and PDF. This research builds a text search application for text documents using the vector space model. The document format used is PDF. Text is extracted from the PDF and then ranked using the vector space model. The PDF document consists of ten pages, each containing one hadith. In general, the system searches the PDF document quite well and displays a list of results ordered by their relevance to the question.

Keywords: answer retrieval, vector space model, text mining
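The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes the text of each PDF page has already been extracted into a plain string (the extraction step is not shown), and it uses one common formulation of the vector space model: TF-IDF weights with cosine similarity. The abstract does not state which weighting scheme the paper uses, so the smoothed IDF formula, the function names, and the placeholder page texts below are all illustrative assumptions, not the author's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs_tokens):
    """Compute a smoothed inverse document frequency for every term.

    Assumption: idf = ln((1 + N) / (1 + df)) + 1, one common variant.
    """
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Build a sparse TF-IDF vector; terms unseen in the collection are dropped."""
    term_freq = Counter(tokens)
    return {t: f * idf[t] for t, f in term_freq.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_pages(pages, query):
    """Return (page_index, score) pairs sorted by descending similarity to the query."""
    docs_tokens = [tokenize(p) for p in pages]
    idf = build_idf(docs_tokens)
    doc_vectors = [tfidf_vector(t, idf) for t in docs_tokens]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [(i, cosine(query_vector, d)) for i, d in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder page texts (hypothetical, not taken from the paper's PDF).
pages = [
    "fasting during the month of ramadan",
    "giving charity to the poor",
    "seeking knowledge from teachers",
]
ranked = rank_pages(pages, "charity for the poor")
for index, score in ranked:
    print(f"page {index}: {score:.3f}")
```

Each page is treated as one document, matching the paper's setup of one hadith per page. A page that shares no terms with the question receives a score of zero and falls to the bottom of the list.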

Similar resources

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function and query analyzer are the main components of an information retrieval system. Document retrieval system is fundamentally based on: Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

Indonesian-English Cross Language Question Answering

Our Indonesian-English Cross Language Question Answering (CLQA) is divided into 4 components: question analyzer, keyword translator, passage retriever and answer finder component. The Indonesian question is inputted into a question analyzer which yields Indonesian keyword list, Indonesian question focus and question class. We defined the question class by using an SVM machine implemented in Wek...

A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System

We have built a CLQA (Cross Language Question Answering) system for a source language with limited data resources (e.g. Indonesian) using a machine learning approach. The CLQA system consists of four modules: question analyzer, keyword translator, passage retriever and answer finder. We used machine learning in two modules, the question classifier (part of the question analyzer) and the answer ...

Classification of Hadiths using LVQ based on VSM Considering Words Order

The religion of Islam is based on a sacred text called Qur'an, a divine speech expressed in Arabic language. Qur'an constitutes the main root of Islam jurisprudence which has a second source of inspiration known as Hadiths. As the Muslim's life is governed by those holy texts, need of their authenticity is required. Using VSM (Vector Space Model), we can represent Hadiths as a vector of words. ...

Publication date: 2017